

Search for: All records

Creators/Authors contains: "Moussa, Marmar R"


  1. Background: Colorectal cancer (CRC) refers to colon and rectal cancer together, as they are treated as a single tumor. In CRC, 72% of tumors are colon cancer, while the other 28% represent rectal cancer. CRC is a multifactorial disease caused by both genetic and epigenetic changes in the colon mucosal cells, affecting oncogenes, DNA repair genes, and tumor suppressor genes. Currently, two DNA methylation-based biomarkers for CRC have received FDA approval: SEPT9, used in blood-based screening tests, and a combination of NDRG4 and BMP3 for stool-based tests. Although DNA methylation biomarkers have been explored in CRC, the identification of robust and clinically valuable biomarkers remains a challenge, particularly for early-stage detection and precancerous lesions. Patients often receive diagnoses at the locally advanced stage, which limits the potential utility of current biomarkers in clinical settings. Methods: The datasets used in this study were retrieved from the GEO database, specifically GSE75548 and GSE75546 for rectal cancer and GSE50760 and GSE101764 for colon cancer, for a total of 130 paired samples. These datasets represent expression profiling by array, methylation profiling by genome tiling array, and expression profiling by high-throughput sequencing, and include rectal and colon cancer samples paired with adjacent normal tissue samples. Differential analysis was used to identify differentially methylated CpG sites (DMCs) and differentially expressed genes (DEGs). Results: From the integration of DMCs with DEGs in colorectal cancer, we identified 150 candidates for methylation-regulated genes (MRGs), with two genes common across all cohorts (GNG7 and PDX1) highlighted as candidate biomarkers in CRC.
The functional enrichment analysis and protein–protein interactions (PPIs) identified relevant pathways involved in CRC, including the Wnt signaling pathway and extracellular matrix (ECM) organization, among other enriched pathways. Conclusions: Our findings show the strength of our in silico computational approach in jointly identifying methylation-regulated biomarkers for colon cancer and highlight several genes and pathways as biomarker candidates for further investigations.
    Free, publicly-accessible full text available June 1, 2026
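
The integration of DMCs with DEGs described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the exact integration rule, so the criterion used here (methylation change and expression change pointing in opposite directions) is an assumption, and the function name `methylation_regulated`, the gene labels, and all numeric values are hypothetical toy data, not results from the study.

```python
def methylation_regulated(dmcs, degs):
    """Return genes whose methylation and expression change in opposite directions.

    dmcs: gene -> methylation difference (tumor minus normal); toy values.
    degs: gene -> log2 fold change (tumor versus normal); toy values.
    The opposite-sign rule is an assumed criterion, not the paper's stated method.
    """
    shared = dmcs.keys() & degs.keys()
    return {gene for gene in shared if dmcs[gene] * degs[gene] < 0}


# Hypothetical per-cohort inputs (placeholder gene labels, made-up numbers).
cohort_a_dmcs = {"GENE_A": 0.30, "GENE_B": -0.25, "GENE_C": 0.20}
cohort_a_degs = {"GENE_A": -1.8, "GENE_B": 1.2, "GENE_C": 0.9}
cohort_b_dmcs = {"GENE_A": 0.22, "GENE_B": 0.15, "GENE_D": -0.40}
cohort_b_degs = {"GENE_A": -2.1, "GENE_B": 0.7, "GENE_D": 1.5}

mrgs_a = methylation_regulated(cohort_a_dmcs, cohort_a_degs)
mrgs_b = methylation_regulated(cohort_b_dmcs, cohort_b_degs)

# Candidates shared by every cohort, analogous to the "common across all cohorts" step.
common = mrgs_a & mrgs_b
print(sorted(common))
```

Intersecting per-cohort candidate sets keeps only genes supported by every dataset, which is one simple way to read the "common across all cohorts" step.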
  2. Free, publicly-accessible full text available February 21, 2026
  3. Abstract: Principal Component Analysis (PCA) has long been a cornerstone in dimensionality reduction for high-dimensional data, including single-cell RNA sequencing (scRNA-seq). However, PCA’s performance typically degrades with increasing data size, can be sensitive to outliers, and assumes linearity. Recently, Random Projection (RP) methods have emerged as promising alternatives, addressing some of these limitations. This study systematically and comprehensively evaluates PCA and RP approaches, including Singular Value Decomposition (SVD) and randomized SVD, alongside Sparse and Gaussian Random Projection algorithms, with a focus on computational efficiency and downstream analysis effectiveness. We benchmark performance using multiple scRNA-seq datasets including labeled and unlabeled publicly available datasets. We apply Hierarchical Clustering and Spherical K-Means clustering algorithms to assess downstream clustering quality. For labeled datasets, clustering accuracy is measured using the Hungarian algorithm and Mutual Information. For unlabeled datasets, the Dunn Index and Gap Statistic capture cluster separation. Across both dataset types, the Within-Cluster Sum of Squares (WCSS) metric is used to assess variability. Additionally, locality preservation is examined, with RP outperforming PCA in several of the evaluated metrics. Our results demonstrate that RP not only surpasses PCA in computational speed but also rivals and, in some cases, exceeds PCA in preserving data variability and clustering quality. By providing a thorough benchmarking of PCA and RP methods, this work offers valuable insights into selecting optimal dimensionality reduction techniques, balancing computational performance, scalability, and the quality of downstream analyses.
    Free, publicly-accessible full text available February 8, 2026
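
As a rough illustration of the Gaussian Random Projection idea named in the abstract above, the following sketch projects synthetic high-dimensional points through a random Gaussian matrix and checks how well pairwise distances survive. It uses only the Python standard library; the data are random stand-ins for an expression matrix, and the function name, dimensions, and seeds are assumptions chosen for illustration rather than the benchmark's actual setup.

```python
import math
import random


def gaussian_random_projection(rows, k, seed=0):
    """Project each row from d dimensions to k using a random Gaussian matrix.

    Entries are drawn from N(0, 1) and scaled by 1/sqrt(k), so squared
    Euclidean distances are preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(rows[0])
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [
        [sum(row[i] * matrix[i][j] for i in range(d)) for j in range(k)]
        for row in rows
    ]


# Synthetic stand-in for a cells-by-genes matrix: 20 "cells", 500 "genes".
data_rng = random.Random(1)
cells = [[data_rng.random() for _ in range(500)] for _ in range(20)]

projected = gaussian_random_projection(cells, k=100)

# Ratio of projected to original distance for every pair of cells.
ratios = [
    math.dist(projected[i], projected[j]) / math.dist(cells[i], cells[j])
    for i in range(len(cells))
    for j in range(i + 1, len(cells))
]
mean_ratio = sum(ratios) / len(ratios)
print(f"mean distance ratio after projection: {mean_ratio:.3f}")
```

A mean ratio near 1 indicates that pairwise distances are roughly preserved. Unlike PCA, the projection matrix is drawn without looking at the data, which is where the computational savings of this kind of method come from.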